
Human Genetics

Springer Science and Business Media LLC

Preprints posted in the last 90 days, ranked by how well they match Human Genetics's content profile, based on 25 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.

1
Tracing the origin of Finnish gelsolin amyloidosis using haplotype sharing trees

Rautila, O. S.; Atula, S.; Mustonen, T.; Schmidt, E.-K.; Valori, M.; Colombo, R.; Kere, J.; Kaivola, K.; Tienari, P. J.

2026-02-14 genetics 10.64898/2026.02.11.705340 medRxiv
Top 0.1%
7.0%

Finnish gelsolin amyloidosis (AGel amyloidosis) is an autosomal dominant systemic amyloidosis caused by the GSN c.640G>A p.D187N (rs121909715) founder variant. The disease was first described in 1969, and it was hypothesized that the Finnish patients share a common ancestor dating back to the 14th century. The link between two Finnish regions with high AGel incidence (Kanta-Hame and Kymenlaakso) has been hypothesized to have occurred in 1365 by a settler moving from Kanta-Hame to Kymenlaakso. Here, we used haplotype sharing trees (HSTs) to analyze Finnish AGel amyloidosis haplotypes and trace the geographic origin of the variant. We also estimated the time from the most recent common ancestor (MRCA) using single nucleotide polymorphism and short tandem repeat data. The HST-based analyses leveraging AGel amyloidosis cohorts from different Finnish geographic regions indicated that the variant more likely appeared first in Kymenlaakso, not Kanta-Hame, contrary to the original hypothesis. The MRCA estimates for Finnish AGel ranged from 15 to 40 generations using four different methods; the mean of all estimates (27 generations) dated back to the 14th century. Thus, the data support the original hypothesis on the variant's spread temporally, but not geographically. These results illustrate the use of HSTs in the analysis of haplotype structures and in tracing the ancestry of a founder variant.
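
As a rough illustration of how a generation count maps onto a calendar date, the sketch below converts the abstract's MRCA estimates (15, 27, and 40 generations) into approximate years. The generation length of 25 years and the reference year of 2026 are assumptions made here for illustration only; the abstract does not state which values the authors used.

```python
# Convert generations-to-MRCA estimates into approximate calendar years.
# ASSUMPTIONS (not from the abstract): 25 years per generation, and the
# count is measured back from 2026, the year the preprint was posted.
YEARS_PER_GENERATION = 25
REFERENCE_YEAR = 2026


def mrca_year(generations, years_per_generation=YEARS_PER_GENERATION,
              reference_year=REFERENCE_YEAR):
    """Approximate calendar year of the most recent common ancestor."""
    return reference_year - generations * years_per_generation


def century(year):
    """Century number (e.g. 1351 -> 14) for a positive calendar year."""
    return (year - 1) // 100 + 1


# Lower bound, mean, and upper bound reported in the abstract.
estimates = {"lower": 15, "mean": 27, "upper": 40}

for label, generations in estimates.items():
    year = mrca_year(generations)
    print(f"{label}: {generations} generations -> about year {year} "
          f"(century {century(year)})")
```

Under these assumed values the mean estimate lands in the 14th century, in line with the abstract; a different generation length would shift the date accordingly.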

2
Short tandem repeats significantly contribute to the genetic architecture of metabolic and sensory age-related hearing loss phenotypes

Ahmed, S.; Vaden, K. I.; Dubno, J. R.; Wright, G.; Drogemoller, B.

2026-02-18 genetic and genomic medicine 10.64898/2026.02.17.26346449 medRxiv
Top 0.1%
6.2%

Age-related hearing loss (ARHL) is a progressive, bilateral decline in hearing ability that affects one in four individuals over 60 years of age worldwide. While previous genome-wide association studies (GWAS) have identified distinct single-nucleotide variants (SNVs) associated with metabolic and sensory ARHL phenotypes, the contribution of short tandem repeats (STRs), a neglected yet important class of genetic variants, remains poorly understood. To address this gap, TRTools was used to impute STRs from a high-quality, sequencing-derived SNV-STR reference panel to investigate the association between STRs and metabolic and sensory estimates. Heritability analyses revealed that while STRs contribute to estimates of both ARHL components, this class of variation plays a more important role in metabolic hearing loss (6%), which typically increases with age, compared to sensory hearing loss (4%). Further, the inclusion of this class of variant in GWAS analyses uncovered an association between a haplotype consisting of two missense variants (rs7714670 and rs6453022) and an intronic STR (chr5:73778077:A16) in ARHGEF28 (P=3.30x10^-9), providing further insight into the variants driving this previously identified signal. Notably, burden analyses revealed that rare and longer repeats were associated with an increased risk of the metabolic phenotype and a reduced risk of the sensory phenotype. Functional annotation of significant and nominally significant STRs revealed potential effects on gene expression and splicing of nearby genes. Our findings provide the first evidence that STRs explain some of the missing heritability of ARHL phenotypes and create an STR resource for researchers to use in future analyses.

3
Ancestry-specific performance of variant effect predictors in clinical variant classification

Hoffing, R.; Zeiberg, D.; Stenton, S. L.; Mort, M.; Cooper, D. N.; Hahn, M. W.; O'Donnell-Luria, A.; Ward, L. D.; Radivojac, P.

2026-02-17 bioinformatics 10.64898/2026.02.14.705914 medRxiv
Top 0.1%
4.2%

Predicting the effects of genetic variants and assessing prediction performance are key computational tasks in genomic medicine. It has been shown that well-calibrated variant effect predictors can be reliably used as evidence towards establishing pathogenicity (or benignity) of missense variants, thereby rendering these variants suitable for use in (or exclusion from) the genetic diagnosis of rare Mendelian conditions. However, most predictors have been trained or calibrated on data that may not be sufficiently representative to lead to similar performance across all genetic ancestries. This raises questions about the responsible deployment of these tools to improve human health. To better understand the utility of computational predictors, we set out to assess their ancestry-specific performance in terms of accuracy and evidence strength according to the ACMG/AMP guidelines. First, we determined that the expected count of rare variants in an individual's genome and the allele frequency distribution of these variants are the key confounders when evaluating a predictor's performance across different genetic ancestries. Second, we found that a predictor's accuracy itself inversely correlates with the allele frequency of the rare variant. After stratifying according to allele frequency, we show that established methods for predicting the pathogenicity of missense variants have comparable performance levels across major ancestry groups. Our results therefore support the wide deployment of such models in the context of genetic diagnosis and related applications.

4
The prevalence of protein misfolding as a mechanism for hereditary deafness

Gogal, R. A.; Cox, G. M.; Kolbe, D. L.; Odell, A. M.; Ovel, C. E.; McCormick, K. I.; Hong, B.; Azaiez, H.; Casavant, T. L.; Smith, R. J. H.; Braun, T. A.; Schnieders, M. J.

2026-03-11 genetics 10.64898/2026.03.09.710547 medRxiv
Top 0.1%
3.9%

Hearing loss is the most common sensory deficit, impacting ~5% of the world's population. The Deafness Variation Database (DVD) is a public resource of deafness variants, containing over 380,000 missense variants across 224 genes, with 303,577 classified as a variant of uncertain significance (VUS). To address the challenge of evaluating each deafness-associated VUS, we evaluate a family of probabilistic frameworks to quantify the strength of computational evidence based on ACMG/AMP recommendations. First, CADD and REVEL are compared using Bayesian models parameterized using either a ClinVar 2019 dataset or labeled DVD variants. The REVEL model built using the DVD dataset demonstrates the best accuracy, sensitivity, and specificity. Incorporation of (in)tolerance to missense variation based on sorting each gene into three bins (tolerant, average, intolerant) shows that intolerant DVD genes are consistent with a higher prior probability of being pathogenic (25.7%) than average (10.7%) or tolerant (8.7%) genes. Finally, the impact of protein folding stability was incorporated using a 2D likelihood, which surpassed the simpler models while also offering a biophysical rationale for the disease mechanism. The protein folding-informed Bayesian model results in 28,866 prioritized VUSs reaching a posterior probability of pathogenicity above 98% with a false positive rate of only 0.14%. Overall, 54,752 missense variants are predicted to cause protein folding destabilization of greater than 1.0 kcal/mol, while 18,706 of the 28,886 prioritized VUS (65%) surpass this threshold. From these VUSs, we identify twelve probands where the patient's genetic diagnosis is upgraded to likely pathogenic/pathogenic. We highlight two variants that cause clear structural disruption, demonstrating the impact of biophysical characterization on variant evaluation.
Author Summary: We investigate the impacts of single amino acid changes on protein structure and folding in the context of hearing loss. Hearing loss is the most common impairment of the main senses, affecting nearly 5% of the world's population. About 45% of people with hearing loss receive a diagnosis after targeted genetic testing. Here, we integrate biophysical data that quantifies the effect of a change to protein sequence on protein folding in combination with genetic data to improve our ability to identify protein amino acid changes that are likely to impact hearing. Our work leads to 12 patients receiving an upgraded diagnosis with their variant disrupting protein stability. Although the method is applied to hearing loss, it can be used for interpreting protein sequence changes in other disease contexts.

5
FRMPD4, a causal gene for intellectual disability and epilepsy, is associated with X-linked non-syndromic hearing loss

Liedtke, D.; Rak, K.; Schrode, K. M.; Hehlert, P.; Chamanrou, N.; Bengl, D.; Katana, R.; Heydaran, S.; Doll, J.; Han, M.; Nanda, I.; Senthilan, P. R.; Juergens, L.; Bieniussa, L.; Voelker, J.; Neuner, C.; Hofrichter, M. A.; Schroeder, J.; Schellens, R. T.; de Vrieze, E.; van Wijk, E.; Zechner, U.; Herms, S.; Hoffmann, P.; Mueller, T.; Dittrich, M.; Bartsch, O.; Krawitz, P. M.; Klopocki, E.; Shehata-Dieler, W.; Maroofian, R.; Wang, T.; Worley, P. F.; Goepfert, M. C.; Galehdari, H.; Lauer, A. M.; Haaf, T.; Vona, B.

2026-03-30 genetic and genomic medicine 10.64898/2026.03.27.26349271 medRxiv
Top 0.1%
3.7%

Background: Understanding the phenotypic spectrum of disease-associated genes is essential for accurate diagnosis and targeted therapy. FRMPD4 (FERM and PDZ Domain Containing 4) has previously been associated with intellectual disability and epilepsy. However, its potential role in non-syndromic hearing loss has not been explored. Methods: We performed genetic analysis in two unrelated families presenting with non-syndromic sensorineural hearing loss, identifying maternally inherited missense variants in FRMPD4. Clinical phenotyping included audiological assessment and evaluation for neurodevelopmental involvement. Cross-species expression analyses were conducted in Drosophila, zebrafish, and mouse. Functional characterization included quantitative evaluation of sound-evoked responses in Drosophila nicht gut hoerend (ngh) mutants, assessment of neuronal development and acoustic startle responses in zebrafish loss-of-function models, and morphological cochlear analyses with auditory brainstem response measurements in knockout mice. Results: Three affected males from two unrelated families presented with prelingual, bilaterally symmetrical sensorineural hearing loss, with confirmed congenital onset in one individual and no evidence of neurodevelopmental abnormalities. Cross-species analyses demonstrated evolutionarily conserved expression of FRMPD4 in auditory structures. In Drosophila, quantitative analysis of sound-evoked responses in ngh mutants revealed impaired auditory function. Zebrafish loss-of-function models exhibited reduced neuronal populations in the otic vesicle and posterior lateral line, abnormal neuromast development, and diminished acoustic startle responses. In mice, Frmpd4 knockout resulted in high-frequency hearing loss and cochlear abnormalities consistent with the human phenotype.
Conclusions: Our findings expand the phenotypic spectrum of FRMPD4 to include non-syndromic sensorineural hearing loss and establish its evolutionarily conserved role in auditory function. These results have direct implications for genetic diagnosis and variant interpretation in patients with hearing loss.

6
A pilot genome-wide association study of ischemic heart disease with co-occurring arterial hypertension in a Kazakh cohort

Skvortsova, L.; Yergali, K.; Zhaxylykova, A.; Begmanova, M.; Mansharipova, A.

2026-03-23 genetic and genomic medicine 10.64898/2026.03.19.26348868 medRxiv
Top 0.1%
3.7%

Genome-wide association studies (GWAS) of ischemic heart disease (IHD) remain underrepresented in Central Asian populations. We conducted a pilot GWAS of IHD with co-occurring arterial hypertension in a Kazakh cohort to identify candidate loci for future replication. A case-control GWAS was performed in 451 individuals (236 cases and 215 controls). Genotyping was conducted using the Illumina Infinium Global Screening Array-24 v3.0. Association testing was performed using logistic regression under an additive genetic model adjusted for age, sex and the first ten principal components (PC1 - PC10). Multiple testing correction was applied using the Bonferroni adjustment. As an additional analysis, knowledge-guided GWAS (KGWAS) followed by MAGMA gene-based testing was used to prioritize candidate genes. After quality control, 345,371 variants were tested. Two loci surpassed the Bonferroni-corrected genome-wide significance threshold: rs28898595 at the UGT1A locus (effect allele C; OR = 0.33, 95% CI = 0.23 - 0.49; p = 3.01x10^-8) and rs28709059 in the intron region of the ACTR3C gene (effect allele C; OR = 0.4, 95% CI = 0.29 - 0.55; p = 4.08x10^-8). Several additional loci showed suggestive evidence of association. In gene-level analysis, the CSMD1 gene demonstrated a significant association signal in MAGMA consistent with the European (p = 1.16x10^-11) and East Asian (p = 9.07x10^-11) LD reference panels. This pilot study identifies genome-wide significant loci (UGT1A, ACTR3C genes) and supports CSMD1 as a prioritized candidate gene for the complex phenotype of IHD associated with co-occurring arterial hypertension in the Kazakh cohort. These findings are preliminary and require replication in larger Central Asian cohorts and further functional validation.
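
The Bonferroni step in the abstract above can be checked with a few lines of arithmetic. The sketch below divides a family-wise error rate by the number of tested variants and compares the two reported p-values against the result. The error rate of 0.05 is an assumption made here (the abstract does not state it), and the variable names are illustrative rather than taken from the authors' pipeline.

```python
# Bonferroni-corrected significance threshold for the number of variants
# tested after quality control, as reported in the abstract.
N_VARIANTS = 345_371

# ASSUMPTION: family-wise error rate of 0.05 (not stated in the abstract).
ALPHA = 0.05

threshold = ALPHA / N_VARIANTS

# Lead-variant p-values reported in the abstract.
p_values = {
    "rs28898595": 3.01e-8,  # UGT1A locus
    "rs28709059": 4.08e-8,  # ACTR3C intron
}

# A variant passes if its p-value is below the corrected threshold.
significant = {rsid: p < threshold for rsid, p in p_values.items()}

print(f"Bonferroni threshold: {threshold:.3e}")
for rsid, passed in significant.items():
    print(f"{rsid}: p = {p_values[rsid]:.2e}, significant = {passed}")
```

Under this assumed error rate the threshold is roughly 1.45x10^-7, and both reported variants fall below it. Both also fall below the conventional fixed genome-wide threshold of 5x10^-8.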

7
Proteogenomic analysis of 5,411 plasma proteins in sickle cell disease patients

Groza, C.; Chignon, A.; Lo, K. S.; Bellegarde, V.; Bartolucci, P.; Lettre, G.

2026-04-07 genetic and genomic medicine 10.64898/2026.04.06.26350255 medRxiv
Top 0.1%
3.6%

There are few therapeutic options to treat patients with sickle cell disease (SCD), a blood disorder caused by mutations in the β-globin gene that affects >7M individuals worldwide. Combining human genetics and high-throughput proteomics can help identify new drug targets. Here, we present results from a proteogenomic analysis of the plasma proteome in SCD patients. We measured the levels of 5,411 plasma proteins and tested their associations with common genetic variation in 343 SCD patients. After conditional analyses, we identified 560 protein quantitative trait loci (pQTL), including 58 (10%) that are novel. Many of these pQTL are not specific to SCD patients and associate with clinically relevant traits in non-SCD African Americans from the Million Veteran Program (e.g. hemoglobin concentration, triglycerides). The effect sizes of the pQTL are largely concordant between SCD and non-SCD individuals, although we found examples (e.g. APOL1, haptoglobin) with evidence of heterogeneity that suggests an interaction between the plasma proteome and the SCD genotype. Finally, we combine pQTL and genome-wide association study results for fetal hemoglobin (HbF) in a Mendelian randomization analysis to prioritize five proteins that may increase HbF production (ENPP5, LBP, NAAA, PT3X, ZP3).

8
An AI-Integrated Framework for Precision Genomics in Coronary Artery Disease Using Whole Exome and Phenotypic Data

UPPALURI, K. R.; CHALLA, H. J.; VEMPATI, K. K.; KADALI, L. N.; PALASAMUDRAM, K.; RAYALA, M.

2026-01-30 genetic and genomic medicine 10.64898/2026.01.28.26345099 medRxiv
Top 0.1%
3.5%

Coronary artery disease (CAD) is a multifactorial condition influenced by genetic, phenotypic, and environmental factors. Traditional risk prediction models fall short in capturing the polygenic complexity of CAD, particularly in underrepresented populations. This study presents SIGMA (Scoring Importance of Genes specific to disease using Machine learning Algorithms), a novel AI-powered framework that enhances CAD risk prediction by integrating genomic and phenotypic data. Our approach leverages GEMS (GeneConnectRx Evidence Metrics), an LLM-driven system to score 1772 CAD-associated genes, and CASCADE (Comprehensive Assessment of Sequence and Clinical Annotation Data Evaluation), a tiered variant scoring pipeline. Using whole exome sequencing (WES) data from 1,243 individuals (628 controls, 615 CAD cases), the model integrates age and gender as key non-modifiable phenotypes. Results show significant improvements in sensitivity (from 0.41 to 0.79), specificity (0.70 to 0.72), and AUC (0.59 to 0.81) when phenotype data are incorporated. Our findings highlight the potential of AI-integrated genomics for population-specific CAD risk stratification.

9
Leveraging the genetics of human face shape boosts the discovery of orofacial cleft risk loci

Herrick, N.; Goovaerts, S.; Manchel, A.; Lee, M. K.; Zhang, X.; Davies, A.; Carlson, J. C.; Leslie-Clarkson, E. J.; Lewis, S. J.; Marazita, M. L.; Cotney, J.; Claes, P.; Shaffer, J. R.; Weinberg, S. M.

2026-02-03 genetic and genomic medicine 10.64898/2026.01.30.26345139 medRxiv
Top 0.1%
3.5%

Several lines of evidence suggest that normal-range facial features and nonsyndromic orofacial clefts (OFCs) exhibit a shared genetic basis. Approaches designed to leverage this relationship hold the possibility of revealing new OFC risk loci by boosting discovery power. To test this idea, we applied a pleiotropy-informed GWAS method (cFDR-GWAS) with summary statistics from large, independent European GWASs of normal facial shape (n=4,680; n=3,566) and nonsyndromic cleft lip with or without cleft palate (nsCL/P, n=3,969). The cFDR approach identified 21 independent genomic loci significantly associated with nsCL/P, providing further evidence of the interconnected genetic architecture between these traits. The five original nsCL/P GWAS signals were detected and joined by nine additional loci previously implicated in other OFC association studies. The remaining seven loci represent new nsCL/P genomic regions, and three of these replicated (P < 0.05) in an independent nsCL/P cohort: ASPSCR1, MSX2, and RALYL. A relaxed 10% cFDR-GWAS threshold identified 15 more independent loci with comparable effect sizes to those detected at the strict 5% threshold, two of which replicated: FHOD3 and SMARCA2. Gene expression patterns in major cell types and spatial transcriptomics data highlighted our gene candidates' roles in craniofacial development. In conclusion, the application of an empirical Bayesian strategy to draw on association signals from genetically related traits can boost the power to identify and prioritize OFC risk loci missed by agnostic gene mapping approaches. These results hold promise that the cFDR-GWAS approach may be able to enhance our understanding of the genetic architecture of other structural birth defects.

10
Polygenic risk scores enhance the identification of carriers of monogenic forms of idiopathic pulmonary fibrosis

Alonso-Gonzalez, A.; Jaspez, D.; Lorenzo-Salazar, J. M.; Delgado, A.; Quintero-Bacallado, A.; Ma, S.-F.; Strickland, E.; Mychaleckyj, J.; Kim, J. S.; Huang, Y.; Adegunsoye, A.; Oldham, J. M.; Maher, T. M.; Guillen-Guio, B.; Wain, L. V.; Allen, R. J.; Saini, G.; Jenkins, R. G.; Molina-Molina, M.; Zhang, D.; Kim Garcia, C.; Martinez, F. J.; Noth, I.; Flores, C.

2026-04-18 genetic and genomic medicine 10.64898/2026.04.16.26350967 medRxiv
Top 0.1%
3.5%

Background: Idiopathic pulmonary fibrosis (IPF) is a rare disease with a poor prognosis. Disease risk involves rare and common genetic variants. However, an inverse association has been described between them. Accordingly, IPF patients with a higher polygenic risk score (PRS) for IPF are less likely to carry rare deleterious variants and vice versa. Here, we evaluate whether the PRS of IPF could serve as an additional criterion for patient prioritisation for rare variant discovery. Methods: We identified carriers based on the presence of rare qualifying variants (QVs) in genes linked to monogenic forms of pulmonary fibrosis in 888 IPF patients from the Pulmonary Fibrosis Foundation Patient Registry (PFF-PR). Genome-wide association study (GWAS) summary statistics from independent cohorts were used to construct a whole-genome PRS (WG-PRS) using a clumping and thresholding method (C+T) and a Bayesian method (SBayesRC). PRS were also derived from 19 known common sentinel IPF variants (Sentinel-PRS). Logistic regression models were used to evaluate associations between PRS and carrier status. Discriminatory performance was evaluated using area under the curve (AUC) analysis, and comparisons were made with the DeLong test. Validation was performed in 472 IPF individuals from the UK PROFILE cohort. Results: IPF-PRS were strongly associated with QV carrier status: Odds Ratio [OR] 0.65 (95% Confidence Interval [CI] 0.53-0.79) for WG-PRS(C+T), OR 0.71 (95% CI 0.59-0.86) for WG-PRS(SBayesRC), and OR 0.77 (95% CI 0.63-0.94) for Sentinel-PRS. Adding WG-PRS to the patient's personal clinical history improved the prediction of QV carriers: AUC=0.62 for the clinical model, AUC=0.68 for WG-PRS(C+T) (DeLong test, p=9.54x10^-4) and AUC=0.66 for WG-PRS(SBayesRC) (DeLong test, p=0.02). Adding IPF-PRS to clinical variables correctly reclassified 22.8% of carriers when using WG-PRS(C+T), 20.8% when using Sentinel-PRS, and 16.7% for WG-PRS(SBayesRC).
WG-PRS(SBayesRC) and the Sentinel-PRS also demonstrated improved prediction of QV carriers in telomere-related genes in PROFILE. Conclusions: Incorporating IPF-PRS into a model based on the patient's clinical history improves the identification of QV carriers. Although the overall discriminatory power was moderate, these findings raise the possibility of using WG-PRS as a useful criterion for rare variant discovery in patients with IPF and to enhance decision-making.

11
Constructing a Literature-Derived Database for Benchmarking Polygenic Risk Score Construction Methods with Spectral Ranking Inferences

Sebastian, C.; Yu, M.; Jin, J.

2026-03-03 genetic and genomic medicine 10.64898/2026.03.01.26347258 medRxiv
Top 0.1%
3.5%

Polygenic risk scores (PRSs) have emerged as a valuable tool for genetic risk prediction and stratification in human diseases. Over the past decade, extensive methodological efforts have focused on improving the predictive power of PRS, leading to the development of numerous methods for PRS construction. Benchmarking these methods has therefore become essential for guiding future PRS applications. While studies have benchmarked subsets of these methods on specific phenotypes and cohorts, the resulting evidence remains fragmented, with a lack of work that comprehensively assesses the relative performance of the various PRS methods. In this study, we addressed this gap by systematically constructing a PRS method benchmarking database synthesizing published results from 2009 to 2025. We applied a spectral ranking inference framework with uncertainty quantification to rank 14 PRS methods that had been adequately compared against each other in the literature. We constructed rankings using two complementary sources: original method-development studies and applications/benchmarking studies. While the highest-ranked methods (LDpred2 and AnnoPred) and the lowest-ranked method (C+T) were consistently identified from both sources, the relative ordering of most methods showed moderate variability. We further constructed phenotype-specific rankings, providing more detailed insights into the robustness and phenotype-specific strengths of individual methods. Collectively, the overall and phenotype-specific rankings of the PRS methods, along with the curated benchmarking data from the literature, provide a dynamic and practical reference database that can be continually updated with emerging PRS methods and published benchmarking results to guide future PRS applications.

12
PALM3 and hearing loss: a potential dual diagnosis interfering with novel gene discovery

Najarzadeh Torbati, P.; Hallbrucker, L.; Hofrichter, M. A. H.; Owrang, D.; Setzke, J.; Kilimann, M. W.; Hemmatpour, A.; Rajati, M.; Ghayoor Karimiani, E.; Haaf, T.; Vogl, C.; Vona, B.

2026-04-21 genetic and genomic medicine 10.64898/2026.04.20.26351093 medRxiv
Top 0.1%
3.0%

Hereditary hearing loss is highly genetically heterogeneous, with emerging overlap between genes implicated in early-onset and age-related hearing loss. We report a consanguineous family with autosomal recessive, non-syndromic hearing loss in which the proband harbors a homozygous splice-site variant in PALM3 (NM_001145028.2:c.314+1G>A) and a homozygous missense variant in OTOA. A minigene assay for the PALM3 variant demonstrated aberrant splicing with exon skipping, resulting in a frameshift and a large in-frame deletion, both consistent with loss of function and impacting all known transcripts. While the organ of Corti from 12-month-old heterozygous Palm3 mice showed preserved overall architecture, published Palm3 knockout mice exhibit auditory dysfunction, supporting an auditory phenotype with loss of function. Although a dual molecular diagnosis cannot be excluded, the combined genetic, functional, and comparative data support PALM3 as a strong candidate gene for autosomal recessive hearing loss.

13
Distinct cochlear cell types associated with genetic susceptibility to sensory and metabolic hearing loss in older adults from the CLSA

Ahmed, S.; Vaden, K. I.; Dubno, J. R.; Drogemoller, B. I.

2026-02-18 genomics 10.64898/2026.02.17.706270 medRxiv
Top 0.1%
2.9%

Hearing loss is a heterogeneous condition that can be classified into different subtypes with diverse genetic and cellular components. To investigate the cochlear cell types underlying the genetic basis of sensory and metabolic components of age-related hearing loss (ARHL), we integrated human genome-wide association study data with mouse cochlear single-cell RNA sequencing data using the single-cell disease relevance score tool. These analyses revealed that genes associated with the sensory component of ARHL in older humans were most highly expressed in the hair cells, while genes associated with the metabolic component of ARHL in older humans were most highly expressed in spiral ganglion neurons. To assess whether age-related transcriptional changes might influence these patterns, we performed age-stratified analyses. In younger mice, sensory hearing loss-associated genes revealed significant heterogeneity in expression in supporting cells within the sensory epithelium. In contrast, the greatest heterogeneity in the expression of metabolic hearing loss-associated genes was observed in intermediate cells of the stria vascularis in older mice. These findings provide evidence for the role of distinct genetic and cellular risk profiles for different ARHL subtypes, suggesting that prevention and therapeutic strategies may require targeting specific cell populations at different life stages.

14
XPOT Deficiency causes a human disorder through impaired tRNA nuclear export

von Hardenberg, S.; Niehaus, I.; Wiemers, A.; Rothoeft, T.; Schaeffer, V.; Huang, K.; Petree, C.; Phillipe, C.; Bruel, A.-L.; Warnatz, K.; Zamani, M.; Ahmadi, R.; Sedaghat, A.; Bahram, S.; Sedighzadeh, S.; Sareh, E.; Khalilian, S.; Landwehr-Kenzel, S.; Schwerk, N.; Abdulwahab, E.; Roesler, J.; Lin, S.-J.; Sabu, S.; Strenzke, N.; Sogkas, G.; Vona, B.; Varshney, G. K.; DiDonato, N.; Bernd, A.

2026-02-04 genetic and genomic medicine 10.64898/2026.01.28.26344748 medRxiv
Top 0.1%
2.6%

Background: The transport of transfer RNAs (tRNAs) from the nucleus to the cytoplasm is a crucial step in the regulation of gene expression and protein synthesis. This process is mediated by specialized export molecules, among which XPOT (Exportin-t, XPO3) plays a central role by recognizing and transporting mature tRNAs through the nuclear pore complex. XPOT is not essential for RNA trafficking in simple organisms; however, the potential impact of XPOT deficiency on human health remains unresolved. Methods: We identified eight patients from five unrelated families with rare biallelic germline variants in XPOT resulting in putative loss of function. Functional analyses were carried out in patient-derived fibroblasts, lymphoblastoid cells and zebrafish models. Ex vivo immunohistochemical stainings for Xpot were performed in the mouse cochlea. xpot knockout zebrafish models were generated to assess morphology and hearing ability. Results: All patients presented with a uniform clinical phenotype that included increased susceptibility to infection, bronchiectasis, severe sensorineural hearing loss, developmental delay, and growth retardation. We demonstrated a complete absence of XPOT protein expression in three patient-derived cell lines. XPOT deficiency leads to disruptions in protein synthesis of the cytokine TNF pathway upon cellular stimulation. Additional XPO1 inhibition in XPOT-deficient cells had little effect on cellular functions, suggesting alternative tRNA nuclear transporter pathways. Increased XPOT immunoreactivity was observed in type I spiral ganglion neurons and hair cells of the mouse cochlea, with enrichment in stereocilia. The xpot knockout zebrafish model showed dysmorphic features and reduced hearing, recapitulating key patient phenotypes. Conclusions: Our findings establish a direct connection between impaired XPOT-dependent tRNA export and human pathology. They illustrate that perturbations in nuclear export pathways lead to disease.
They also raise the possibility that other nuclear transport receptors may play similarly underappreciated roles in human health and disease. The identification of XPOT as a disease-associated gene opens up new research directions and potential targets for therapeutic intervention.

15
Assessing the clinical significance of a novel rare variant in Loeys-Dietz Syndrome by combining AI-driven modelling and cell biology

Boukrout, N.; Delage, C.; Comptdaer, T.; Arondal, W.; Jemel, A.; Azabou, N.; Bousnina, M.; Mallouki, M.; Sabaouni, N.; Arbi, R.; Kchaou, S.; Ammar, H.; Hantous-Zannad, S.; Jilani, H.; Elaribi, Y.; Benjemaa, L.; Van der Hauwaert, C.; Larrue, R.; CHEOK, M.; Perrais, M.; Lefebvre, B.; Cauffiez, C.; Pottier, N.

2026-03-31 genetic and genomic medicine 10.64898/2026.03.30.26349510 medRxiv
Top 0.1%
2.5%

Loeys-Dietz syndrome (LDS) is an autosomal dominant connective-tissue disorder caused by genetic variants in TGF-β pathway genes, most often TGFBR1/2. While pathogenic TGFBR2 genetic mutations usually cluster in the kinase domain and disrupt SMAD signalling, distinguishing with confidence those with functional impact on TGFBR2 function from rare benign genetic alterations represents one of the most important ongoing challenges for accurate genetic testing. Therefore, there is a pressing need to develop methods that can improve functional variant interpretation. Here, we describe and characterize the functional impact of a novel genetic variant in the TGFBR2 kinase domain (E431K), in a patient with the clinical diagnosis of syndromic genetic aortopathy. We assessed the structural and functional consequences of this variant using AI-driven molecular modelling and in vitro cell-based assays. A high-quality homology-based model of TGFBR2 was generated and computational mutagenesis based on the structural context and evolutionary conservation was used to forecast variant pathogenicity. Relative to wild type, the variant affects protein stability by disrupting intramolecular interactions and likely induces conformational changes that may affect kinase activity and thus TGF-β signalling. This was experimentally confirmed by showing abnormal protein level and alteration of canonical TGF-β pathway activation. Overall, our results establish that the E431K variant leads to aberrant TGF-β signalling and confirm the diagnosis of Loeys-Dietz syndrome type 2 in this patient.

16
Calibration of in-frame indel variant effect predictors for clinical variant classification

Abderrazzaq, H.; Singh, M.; Babb, L.; Bergquist, T.; Brenner, S. E.; Pejaver, V.; O'Donnell-Luria, A.; Radivojac, P.; ClinGen Computational Working Group; ClinGen Variant Classification Working Group

2026-04-18 bioinformatics 10.64898/2026.04.15.718599 medRxiv
Top 0.1%
2.4%

Insertions and deletions (indels) represent a substantial source of genetic variation in humans and are associated with a diverse array of functional consequences. Despite their prevalence and clinical importance, indels, particularly short in-frame indels, remain critically understudied compared to single nucleotide variants and are challenging to interpret clinically. While many computational predictors for missense variants have been rigorously evaluated and calibrated for clinical use, the clinical utility of tools for in-frame indels remains uncertain. To address this gap, we have calibrated in-frame indel prediction tools for clinical variant classification. We constructed a high-confidence dataset of in-frame indel variants (≤ 50 bp) from clinical and population databases and estimated the prior probability of pathogenicity of a rare in-frame indel observed in a disease-associated gene, and of an insertion and deletion separately. Using a previously developed statistical framework based on local posterior probabilities, we then established score thresholds for eight computational tools, corresponding to distinct evidence levels for pathogenic and benign classification according to ACMG/AMP guidelines. All in-frame indel predictors evaluated here reached multiple evidence levels of pathogenicity and/or benignity, demonstrating measurable clinical value. However, these models consistently exhibited lower performance levels compared to missense predictors, highlighting the need for improved computational approaches for indel classification.
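
The link between a prior probability of pathogenicity and posterior-probability cut-offs for ACMG/AMP evidence levels can be sketched as follows. This is a minimal illustration, not the authors' calibration code: it assumes the widely used Bayesian formulation in which "very strong" evidence corresponds to an odds of pathogenicity of 350 and each weaker level takes a successive square root of that value. The prior of 0.10 and the function names are placeholders; the abstract does not state the prior estimated for in-frame indels, and a different prior may call for a different odds constant.

```python
def posterior_from_odds(odds_path: float, prior: float) -> float:
    """Posterior probability of pathogenicity given an odds of
    pathogenicity (likelihood ratio) and a prior probability."""
    return (odds_path * prior) / ((odds_path - 1.0) * prior + 1.0)


def evidence_thresholds(prior: float, odds_very_strong: float = 350.0) -> dict:
    """Posterior cut-offs a local posterior must reach for each
    pathogenic evidence level (assumed exponential scaling of odds)."""
    exponents = {
        "supporting": 1 / 8,
        "moderate": 1 / 4,
        "strong": 1 / 2,
        "very_strong": 1.0,
    }
    return {
        level: posterior_from_odds(odds_very_strong ** exponent, prior)
        for level, exponent in exponents.items()
    }


# Illustrative prior only; not the value estimated in the preprint.
thresholds = evidence_thresholds(prior=0.10)
for level, cutoff in thresholds.items():
    print(f"{level}: posterior >= {cutoff:.3f}")
```

Under these assumptions, a predictor score interval whose local posterior probability reaches a given cut-off would be assigned that evidence level; benign evidence would follow the same logic with the reciprocal odds.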

17
Features Influencing Diagnostic Yield of Exome Sequencing in the DECIPHERD Study in Chile

Moreno, G.; Rebolledo-Jaramillo, B.; Böhme, D.; Encina, G.; Martin, L. M.; Zavala, M. J.; Espinosa, F.; Hasbun, M. T.; Poli, M. C.; Faundes, V.; Repetto, G. M.

2026-02-22 genetic and genomic medicine 10.64898/2026.02.12.26345769 medRxiv
Top 0.1%
1.9%

Background: Exome sequencing (ES) has become a key diagnostic tool for rare diseases (RDs). However, most evidence on ES performance comes from high-income countries and patients of European ancestry. In countries such as Chile, limited access to next-generation sequencing (NGS) amplifies health disparities and highlights the need to identify which patients are most likely to benefit from ES. Methods: This study presents the second phase of the Chilean DECIPHERD project, in which we performed ES in a new group of patients with RDs presenting with multiple congenital anomalies (MCA), neurodevelopmental disorders (NDD), and/or suspected inborn errors of immunity. To identify clinical and demographic factors associated with an increased probability of obtaining an informative ES result, we conducted a logistic regression analysis, combining the results of the first and second phases of the project. We also evaluated global ancestry, estimated objectively using ADMIXTURE, as a potential factor. Results: Sixty-seven patients participated in this second phase of DECIPHERD, with a median age of 6 years (range: 0-27); 55.2% were female, and the average (± s.d.) proportion of Native American ancestry was 0.615 ± 0.18. Clinically, 52.2% presented with both MCA and NDD, and the rest had other phenotype combinations. An informative result, including pathogenic or likely pathogenic variants in genes consistent with the patient's phenotype, was identified in 34.3% of the cohort; 61% of these variants had not been previously reported in databases such as ClinVar. By combining the two phases of the study, we reached a total of 167 patients, in whom the presence of NDD and/or MCA significantly increased the probability of achieving an informative ES outcome. In contrast, previous use of gene panel testing was associated with a decreased likelihood of receiving an informative result. Ancestry was not associated with diagnostic yield.
Conclusions: This study demonstrates the utility of ES in achieving a diagnosis in a clinically diverse cohort of Chilean patients with RDs, and characterizes features associated with a higher diagnostic yield. These findings may contribute to evidence-based patient prioritization strategies in settings with limited access to NGS resources.

18
Variant curation of the largest compendium of FOXL2 coding and non-coding sequence and structural variants in BPES

Matton, C.; Van De Velde, J.; De Bruyne, M.; Van De Sompele, S.; Hooghe, S.; Syryn, H.; Bauwens, M.; D'haene, E.; Dheedene, A.; Cools, M.; Komatsuzaki, S.; Preizner-Rzucidlo, E.; Ross, A.; Armstrong, C.; Watkins, W.; Shelling, A.; Vincent, A. L.; Cassiman, C.; Vermeer, S.; Bunyan, D. J.; Verdin, H.; De Baere, E.

2026-03-02 genetic and genomic medicine 10.64898/2026.02.24.25339471 medRxiv
Top 0.1%
1.8%

Heterozygous FOXL2 (non-)coding sequence and structural variants (SVs) lead to blepharophimosis, ptosis and epicanthus inversus syndrome (BPES), a rare, autosomal dominant developmental disorder characterized by a completely penetrant eyelid malformation and incompletely penetrant primary ovarian insufficiency (POI). We collected variants from our in-house database, generated via clinical genetic testing and downstream research testing in the Center for Medical Genetics Ghent, Belgium (2001-2024), and via literature and other resources in the same period. All retrieved variants were classified according to ACMG/AMP criteria to improve knowledge of their pathogenicity. We collected 413 unique genetic defects of the FOXL2 region, including 76 novel variants, in 864 index patients. Of these patients, 87% carried a coding FOXL2 sequence variant. The polyalanine tract is a known mutational hotspot of FOXL2, illustrated here by the high percentage of pathogenic polyalanine expansions (24%). Furthermore, the molecular spectrum in typical BPES index patients is characterized by 8% coding deletions and 3% deletions located up- and downstream of FOXL2. The remaining 2% carry translocations along with chromosomal rearrangements of 3q23. This uniform and structured reclassification, incorporating the largest dataset of variants implicated in FOXL2-associated disease so far, will improve both diagnosis and genetic counselling for individuals with BPES.

19
Benchmarking 80 binary phenotypes from the openSNP dataset using deep learning algorithms and polygenic risk score tools

Muneeb, M.; Ascher, D.; Myung, Y.; Feng, S.; Henschel, A.

2026-03-09 bioinformatics 10.64898/2026.03.06.710126 medRxiv
Top 0.1%
1.8%

Genotype-phenotype prediction plays a crucial role in identifying disease-causing single nucleotide polymorphisms and in precision medicine. In this manuscript, we benchmark the performance of various machine/deep learning algorithms and polygenic risk score tools on 80 binary phenotypes extracted from the openSNP dataset. After cleaning and extraction, the genotype data for each phenotype is passed to PLINK for quality control, after which it is transformed separately for each of the considered tools/algorithms. To compute polygenic risk scores, we used the quality control measures for the test data and the genome-wide association study summary statistics file, along with various combinations of clumping and pruning. For the machine learning algorithms, we used p-value thresholding on the training data to select the single nucleotide polymorphisms, and the resulting data was passed to the algorithm. Our results report the average 5-fold Area Under the Curve (AUC) for 29 machine learning algorithms, 80 deep learning algorithms, and 3 polygenic risk score tools with 675 different clumping and pruning parameters. Machine learning performed best for 44 phenotypes, while polygenic risk score tools performed best for 36 phenotypes. The results give valuable insights into which techniques tend to perform better for certain phenotypes compared to more traditional polygenic risk score tools.
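
The "average 5-fold AUC" reported above can be illustrated with a small, dependency-free sketch. It is not the authors' pipeline: the function names and toy data are hypothetical, and AUC is computed here through its rank interpretation, the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of
    case (label 1) and control (label 0) scores."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("each fold needs at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def mean_fold_auc(folds):
    """Average AUC over held-out folds; each fold is (labels, scores)."""
    return sum(auc(y, s) for y, s in folds) / len(folds)


# Toy held-out predictions for two folds (hypothetical values).
toy_folds = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6]),
]
print(mean_fold_auc(toy_folds))
```

The same averaging applies whether the per-individual score comes from a classifier's predicted probability or from a polygenic risk score, which is what makes the two families of methods comparable on one metric.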

20
Clinical evidence yield as a framework for evaluating computational predictors and multiplexed assays of variant effect

Shang, Y.; Badonyi, M.; Marsh, J. A.

2026-03-30 bioinformatics 10.64898/2026.03.27.714777 medRxiv
Top 0.1%
1.8%

Interpreting the clinical significance of missense variants of uncertain significance (VUS) remains a major challenge in clinical genetics. Although computational variant effect predictors (VEPs) and multiplexed assays of variant effect (MAVEs) can generate large-scale functional scores, their value is typically assessed using discrimination metrics such as AUROC rather than by the strength of evidence they provide under ACMG/AMP guidelines. Here, we introduce mean evidence strength (MES), a quantitative metric that summarises the pathogenic and benign evidence assigned across missense variants following gene-level Bayesian calibration. Using the acmgscaler framework, we calibrated 12 population-free VEPs across 367 disease genes and analysed 15 MAVE datasets with sufficient clinical data. MES revealed important discrepancies with AUROC, including cases where methods with similar discrimination differed substantially in evidence yield. MAVEs achieved high average MES despite lower AUROC, while several VEPs showed strong discrimination but more limited calibrated evidence. Among predictors, CPT-1 achieved the highest MES and provided moderate or stronger evidence for the largest fraction of ClinVar VUS. MES therefore provides a practical framework for evaluating computational and experimental variant effect datasets in terms of calibrated clinical evidence yield.